Mesa: Geo-Replicated, Near Real-Time, Scalable Data Warehousing

Authors

  • Ashish Gupta
  • Fan Yang
  • Jason Govig
  • Adam Kirsch
  • Kelvin Chan
  • Kevin Lai
  • Shuo Wu
  • Sandeep Govind Dhoot
  • Abhilash Rajesh Kumar
  • Ankur Agiwal
  • Sanjay Bhansali
  • Mingsheng Hong
  • Jamie Cameron
  • Masood Siddiqi
  • David Jones
  • Jeff Shute
  • Andrey Gubarev
  • Shivakumar Venkataraman
  • Divyakant Agrawal

Abstract

Mesa is a highly scalable analytic data warehousing system that stores critical measurement data related to Google’s Internet advertising business. Mesa is designed to satisfy a complex and challenging set of user and systems requirements, including near real-time data ingestion and queryability, as well as high availability, reliability, fault tolerance, and scalability for large data and query volumes. Specifically, Mesa handles petabytes of data, processes millions of row updates per second, and serves billions of queries that fetch trillions of rows per day. Mesa is geo-replicated across multiple datacenters and provides consistent and repeatable query answers at low latency, even when an entire datacenter fails. This paper presents the Mesa system and reports the performance and scale that it achieves.

Similar resources

Active Data Warehousing: A New Breed of Decision Support

Active data warehousing is rapidly changing the landscape for deployment of decision support solutions. The trend toward actionable business intelligence demands that capabilities for tactical and event-driven decision-making be supported in addition to traditional uses of the data warehouse for strategic decision-making. The resulting challenges to deliver extreme service levels in the areas o...

Stronger Semantics for Low-Latency Geo-Replicated Storage

We present the first scalable, geo-replicated storage system that guarantees low latency, offers a rich data model, and provides “stronger” semantics. Namely, all client requests are satisfied in the local datacenter in which they arise; the system efficiently supports useful data model abstractions such as column families and counter columns; and clients can access data in a causally consistent...

Transactions with Consistency Choices on Geo-Replicated Cloud Storage

Pileus is a replicated and scalable key-value storage system that features geo-replicated transactions with varying degrees of consistency chosen by applications. Each transaction reads from a snapshot selected based on its requested consistency, from strong to eventual consistency or intermediate guarantees such as read-my-writes, monotonic, bounded, and causal.

Real-time workflow audit data integration into data warehouse systems

Workflow management systems are being increasingly used by many organizations to automate business processes and decrease costs. Audit trails from workflow management systems include significant amounts of information that can be used to analyze and monitor the performance of business processes in order to improve the efficiency. Traditional approaches for using workflow audit trail for decisio...

Near Real-time Data Warehousing with Multi-stage Trickle & Flip

A data warehouse typically is a collection of historical data designed for decision support, so it is updated from the sources periodically, mostly on a daily basis. Today’s business however asks for fresher data. Real-time warehousing is one of the trends to accomplish this, but there are a number of challenges to move towards true real-time. This paper proposes ‘Multi-stage Trickle & flip’ me...

Journal title:
  • PVLDB

Volume 7  Issue 

Pages  -

Publication date 2014